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Abstract — The factor graph approach to discrete-time linear 
Gaussian state space models is well developed. The paper extends 
this approach to continuous-time linear systems /filters that are 
driven by white Gaussian noise. By Gaussian message passing, 
we then obtain MAP /MMSE/ LMMSE estimates of the input 
signal, or of the state, or of the output signal from noisy obser- 
vations of the output signal. These estimates may be obtained 
with arbitrary temporal resolution. The proposed input signal 
estimation does not seem to have appeared in the prior Kalman 
filtering literature. 



I. Introduction 

Consider the system model shown in Fig. [T] a continuous- 
time linear time-invariant system /filter is fed by a continuous- 
time signal U(t). The system output Y(t) is sampled (at 
regular or irregular intervals) and the samples are corrupted 
by discrete-time additive white Gaussian noise. From the 
noisy samples Y^, we wish to estimate the clean samples 
Yfe, or the clean signal Y(t) at arbitrary instants t, or the 
state trajectory of the system, or — of particular interest in this 
paper — the input signal U(t) at arbitrary instants t. We will 
not assume that any of these signals is bandlimited (in the 
strict sense required by the sampling theorem); instead, the 
key assumption in this paper is that the given linear system 
has a finite-dimensional state space representation. 

Problems of this kind are ubiquitous. For example, Fig. [T] 
might model an analog-to-digital converter with a non-ideal 
anti-aliasing filter and with quantization noise Zk\ indeed, this 
application is a main motivation for this paper. As another 
example, Fig. [T] might model a sensor with some internal 
dynamics which limits its temporal resolution of the desired 
quantity U(t). In both examples, we are primarily interested 
in estimating the input signal U(t). 

We will address these estimation problems under the further 
assumption that the input signal U(t) is white Gaussian 
noise. It might perhaps seem at first that this assumption is 
problematic when U(t) is actually the signal of interest, as in 
the two mentioned examples. However, we will argue that this 
assumption is meaningful in such cases and that the LMMSE 

Lukas Bolliger and Hans-Andrea Loeliger are with the Dept. of 
Information Technology and Electrical Engineering, ETH Zurich, CH- 
8092 Zurich, Switzerland. Email: loeliger@isi.ee.ethz.ch, 
lukas@bolligernet . ch. 

Christian Vogel is with the Telecomm. Research Center Vienna 
(FTW), Donau-City-Strasse 1, A-1220 Vienna, Austria. Email: 
c.vogel@ieee. org. 

An abbreviated version of this paper was presented at the 2010 Information 
Theory & Appl. Workshop (ITA), La Jolla, CA, Feb. 2010 (TJ. 



r-n U(t) 






Y{t) 


1 1 









Y, 



4> 



WGN 
source 



Y, 



source 



linear 
system / filter 



Fig. 1. System model. 



(linear minimum mean squared error) estimate of U(t) is well 
defined and useful. An example of such an LMMSE estimate 
of U (t) is shown in Fig 
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The nature of this estimate will 
further be illuminated by reformulating it as a regularized 
least-squares problem with a penalty term J u(t) 2 dt, as will 
be discussed in Section fVl 

The assumption that U(t) is white Gaussian noise turns 
our system model (Fig. [T| into a linear Gaussian model, and 
LMMSE estimation of the state trajectory or of the clean 
output signal Y(t) amounts essentially to Kalman filtering 
(or rather Kalman smoothing) pl-p). However, estimation of 
the continuous -time input signal U(t) does not seem to have 
been addressed in the Kalman filtering literature. 

We will also consider some extensions of the system model 
including time-varying systems, vector signals, and systems 
with internal noise sources. These extensions are required in 
some of the motivating applications, but the extensions are 
straightforward and standard in Kalman filtering. 

We will address these estimation problems (as described 
above) using factor graphs. Factor graphs [9 |-[12] and similar 
graphical models [13|-[16| allow a unified description of 
system models and algorithms in many different fields. In par- 
ticular, Gaussian message passing in factor graphs subsumes 
discrete-time Kalman filtering and many variations of it |9)- 
p2| . The graphical-model approach has facilitated the use 
of these techniques as components in more general inference 
problems and it has become a mode of teaching discrete-time 
Kalman filtering itself. 

In this paper, we extend the factor graph approach to 
continuous -time models with discrete-time observations as 



'Note that the Kalman-Bucy filter [3] addresses the different situation where 
the observations are continuous-time signals as well. 
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described above. This extension appears to be new]^] and it sig- 
nificantly enlarges the domain of graphical models. We note, in 
particular, that the LMMSE estimates of the continuous-time 
signals associated with such models (such as U(t) and Y(t) in 
Fig. [TJ become computational objects that can be handled with 
arbitrary temporal resolution by Gaussian message passing. 

Applications of the methods of this paper (in addition to 
those already mentioned) have been reported in JT7) and 
p8) . In (T7) , a new method for sampling jitter correction 
is proposed that uses the slope of Y(t), which is available 
in the state space model, in an iterative algorithm. In |18|, 
a new approach to analog-to-digital conversion is proposed 
which combines unstable analog filters with digital estimation 
of U(t) as proposed in the present paper. Both of these 
applications build on |T) (which does not contain the proofs) 
and rely on the present paper for a full justification of the 
proposed algorithms. Further applications (including beam- 
forming with sensor arrays and Hilbert transforms) will be 
reported elsewhere. 

In summary, this paper 

• extends the factor graph approach to continuous-time 
models as in Fig. [T[ 

• extends Kalman smoothing (forward-backward Gaussian 
message passing) to the estimation of input signals as 
U(t) in Fig. [1] 

• provides the necessary background for subsequent work 
such as (17) and (18). 

The paper builds on, and assumes some familiarity with, 
the factor graph approach to discrete-time Kalman filtering as 
given in (TTJ. 

The paper is structured as follows. The system model is 
formally stated in Section [II] and represented in factor graph 
notation in Section 
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State estimation and output signal 
estimation are then essentially obvious, but some pertinent 
comments are given in Section [IV] Estimation of the input 
signal is discussed in Section [V] In Section VI the estimation 
algorithms are illustrated by some simple numerical examples. 
A number of extensions of the basic system model are outlined 



in Section VII and Section VIII concludes the paper. 

The following notation will be used: x denotes the complex 
conjugate of x; A T denotes the transpose of the matrix A; 
A H = B denotes the Hermitian transpose of A; I denotes an 
identity matrix; "oc" denotes equality up to a constant scale 
factor; J\f(m,<r 2 ) or Af(m,V) denotes a normal (Gaussian) 
distribution with mean m and variance a 2 , or with mean vector 
m and covariance matrix V, respectively. 

II. System Model 

Let X £ K™ be the state of a linear system (as, e.g., in 
Fig. [T| which evolves in time according to 

X(t)=AX(t) + bU(t) (1) 

where X denotes the derivative with respect to time and where 
both the matrix A £ ]R nx ™ and the vector b £ W l are known. 

2 Another extension of graphical models to continuous time are continuous- 
time Bayesian networks |19| - |21| , where the emphasis is on finite-state 
models and approximate inference. Yet another such extension is [22], where 
linear RLC circuits are described in terms of factor graphs. 



The system output is the discrete-time signal Y\, Yj, ■ ■ ■ 6R" 
with 



Y k = CX{t k ) 



(2) 



where t±,t2, ■ ■ ■ £ M (with tk-x < tk) are discrete instants of 
time and where C G R" x " is known. We will usually observe 
only the noisy output signal Y\ , Y%, . . . defined by 



Y k = Y k + Z k 



(3) 



where Zx, Z 2 , ■ ■ ■ (the noise) are independent Gaussian random 
variables, each of which takes values in W and has a diagonal 
covariance matrix Vz- 

The (real and scalar) input signal U(t) will be modeled as 
white Gaussian noise, i.e., for t < t' , the integral 



U(t) dr 



(4) 



is a zero-mean Gaussian random variable with variance 
cr^jit' — t), and any number of such integrals are independent 
random variables provided that the corresponding integration 
intervals are disjoint. In consequence, it is appropriate to 
replace ([T]i by 



dX(t) = AX(t) dt + b U(t) dt 



(5) 



where U(t) dt is a zero-mean Gaussian random variable with 
infinitesimal variance afj dt. 

As stated in the introduction, we will argue later (in 
Section |v| that modeling U(t) as white Gaussian noise is 
meaningful even when U(t) is a (presumably smooth) signal 
of interest that we wish to estimate. 

For any fixed initial state X(to) = x(to), equation |5]) 
induces a probability density /(a;(ti)|x(fo)) over the possible 
values of X(tx) (where t and tx are unrelated to the discrete 
times {tk} in Specifically, integrating |5]) from t = to to 
t\ > to yields 

X(t x ) = e AT X(t ) + f e A{T -^bU{to + r)dT (6) 
Jo 

with T = t\ — to > 0. If U(t) is white Gaussian noise (with 
<7ij as above), then the integral in (|6jl is a zero-mean Gaussian 
random vector with covariance matrix]^] [23 1-|25 1 



V S = u% f e A ^bb T (e A ^) T dr 
Jo 

= al\ e AT bb T (e AT ) T dT. 
Jo 



(7) 
(8) 



It is thus clear that, for fixed X(to) = x(to), X(ti) is a 
Gaussian random vector with mean e AT x(to) and covariance 
matrix Vs, i.e., 

f(x(tx)\x(t )) oc e-M^O-^MX-^O-^M. 

(9) 



This covariance matrix is essentially the controllability Gramian. However, 
controllability is not required in this paper. 
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Fig. 2. Factor graph of the system model with observations Yj, = jjk- 
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Fig. 4. Splitting the node / factor f {x(tk+i)\z(t k )j to access the state at 
an intermediate point in time t' . 



Each of the factors in Fig. [4] can, of course, be replaced by 
the corresponding decomposition according to Fig. [3] 

Note that the input signal U(t) is not explicitly repre- 
sented in Figures [2]-^] For the estimation of U(t), we will 
therefore need another decomposition of the node /factor 
f(x(t k+ i)\x{t k )). 
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Fig. 3. Factor graph of /(a;(ti)|a;(to)) according to J6j— J8J. (The values 
of to and t\ are not restricted to the discrete times {tfe} in Fig. ri]) 



III. Factor Graph of System Model 

We will use Forney factor graphs (also known as normal 
factor graphs |26|) as in (TO") and fTT) . The nodes /boxes in 
such a factor graph represent factors and the edges in the graph 
represent variables. 

In this notation, the system model of Section III] may 
be represented by the factor graph shown in Fig. [2] More 
precisely, Fig. [2]represents the joint probability density of the 
variables in the system model at discrete times t 1 ,t 2 ,- ■ ■■ 

Note that Fig. [2] shows only a section (from t k to tk+i) of 
the factor graph; the complete factor graph starts at time to 
and ends at some time tjc , and it may contain additional nodes 
to represent any pertinent initial or final conditions. Note also 
that, apart from the Gaussian nodes/factors f (^x(t k +i)\x(tk)) 
and jV(0, Vz), the nodes /boxes in Fig. [2] represent linear 
constraints. 

For details of this factor graph notation, we refer to fTT) . 

As shown in Fig. [2] the function (|9|l can immediately be 
used as a node in a factor graph. However, the function (|9]l can 
itself be represented by nontrivial factor graphs. A first such 
factor graph is shown in Fig. [3] which corresponds to |6]h[8]i. 
Plugging Fig. [3] into Fig. [2] results in a standard discrete-time 
linear Gaussian factor graph as discussed in depth in (TTJ. 

The factor graph of Fig. [2]is easily refined to arbitrary tem- 
poral resolution by splitting the node /factor f(x(tk+i)\ x(tk)) 
as shown in Fig. |4] In this way, both the state X(t) and the 
output signal Y(t) — CX(t) become available for arbitrary 
instants t between tk and tk+i- 



IV. Gaussian Message Passing, State Estimation, 
and Output Signal Estimation 

Having thus obtained a discrete-time factor graph (with an 
arbitrary temporal resolution), estimating X(t) or Y(t) from 
the noisy observations Y\=yx, Y2 = y%, ... by means of 
Gaussian message passing is standard and discussed in detail 
in (TTJ (cf. also (10| and 1 12 1). We therefore confine ourselves 
to a few general remarks (mostly excerpted from 1 10 1 and (TJJ) 
and some additional remarks on message passing through the 
node /factor f (x{t)\x(to)) ■ 

A. General Remarks 

1) In linear Gaussian factor graphs such as Figures 12^4] 
(where all factors are either Gaussians or linear con- 
straints), all sum-product messages are Gaussians and 
sum-product message passing coincides with max- 
product message passing. Moreover, MAP (maximum 
a posteriori) estimation coincides both with MMSE 
(minimum mean squared error) estimation and with 
LMMSE (linear /affine MMSE) estimation. 

2) In general, every edge in the factor graph carries two 
messages, one in each direction. Since all the edges in 
Figures [2|]4] are directed (i.e., drawn with an arrow), 
we can unambiguously refer to the forward message 
~jix and the backward message along the edge 
representing some variable X. 

3) Gaussian messages have the form 

fi(x) (xe-^ x - m)Jw{x ' m) ; (10) 

they are naturally parameterized by the mean vector 
in and either the matrix W or the covariance matrix 
V (= W' 1 ). Degenerate Gaussians, where either W 
or V do not have full rank, are often permitted and 
sometimes unavoidable; in such cases, only W or V, 
but not both, are well defined. We will use the symbols 
r&x and \^x (or V&x) to denote the parameters of 
the forward message (along some edge / variable X) 
and tnx and V ' x (or Wx) for the parameters of the 
backward message. 

4) The natural scheduling of the message computations in 
Fig. |2| consists of a forward recursion for ~~jlx(t k ) and 
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TABLE I 

Computation rules for Gaussian messages through 

NODE/FACTOR f(x(t\)\x(to)) WITH tl > to. 





X{t ) 




X(ti) 










f(x(t!)\x(t )) 










(1.1) 


V X {H) 


= e A(t 1 -to)^ x(to)e A 1 


(*i-*o) 






/-ti-to 

+ / 




T e ATT dr 


(1.2) 




Q~4(t 1 -t c 


)Q H see (H) 






= e -A( tl -t )^ (ti) 




(1.3) 


v X(t ) 


= e -A(t 1 -t )V x(ti)e - 


-A T (*!-*(,) 






/-ti-to 

+ 4 / 


e- AT bb T e- ATT dr 


(1.4) 




Q&(ti- 


t )Q H see {T4) 




u(t) 


= (Vx (< ) 


+ ^X(t)) (^X(t) ~ 


"tx(t)) 










(1.5) 



Both of 



an independent backward recursiorrjfor ^jlx(t k ) 
these recursions use the messages with parameters 
Wfc = Z/fc and WY k = V~ l (assuming Y k = y k is 
known; if Y k is not observed /unknown, then Wy k = 
and ^YkiVk) = D- 

5) Since the factor graph in Fig. [2] has no cycles, the 
a posteriori distribution of any variable X (or Y, Z, 
. . . ) in the factor graph is the product ~jtx ( x )px(x) of 
the corresponding two messages, up to a scale factor. 
The parameters of this marginal distribution are mx 
and Wx given by W x = W x + W x and Wxm x = 

W x rtix + Wx^nx- 

6) Tabulated message computation rules (in particular, Ta- 
bles 2-4 of fTT) ) allow to compose a variety of different 
algorithms to compute the same sum-product messages. 
The variety arises from different parameterizations of 
the messages and from local manipulations (including 
splitting and grouping of nodes) of the factor graph. 



B. Message Passing Through f (x(ti)\x{t$)} 

Gaussian message passing through the node /factor 
/(x(ii)|x(io)) is summarized in Ta ble p ] Both the forward 
message (with parameters ( |I.l) and ( OF ) and the backward 
message (with parameters dO| and (I.4i) are easily obtained 
from Fig. [3] and and Tables 2 and 3 of (TTJ. 

4 The backward recursion is required for smoothing, i.e., noncausal esti- 
mation or estimation with delay; it is absent in basic Kalman filtering as 
in |2|. In fact, while the backward recursion is obvious from the graphical- 
model perspective, its development in the traditional approach was not quite 
so obvious, cf. jSJ. 



If the matrix A is diagonalizable, then the integrals in (|I.2[) 



and (1.4 1 can easily be expressed in closed form. Specifically, 
if 

At \ 

A„ / 



A = Q 



(11) 



for some complex square matrix Q, then 

f e AT bb T e ATT dr = Q^(t)Q H 
Jo 

where the square matrix ~$(t) is given by 
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and 



- AT bb T e- ATT dr = Q^(t)Q H 



with 



^ .(Q^) fc (Q-^ fi _ e _ (W 
W ' X k + X e 



(12) 



(13) 



(14) 



(15) 



Note that, in ( 13 1 and \15\ , (Q~ 1 b)k denotes the k-th compo- 
nent of the vector Q b. The proof of ( fl2"} and ( fl4} is given 
in Appendix |A| 

The remaining entry (1.5 1 in Table [I] is Theorem [TJ of the 
next section. 

V. Input Signal Estimation and 
Regularized-Least-Squares Interpretation 

We now turn to estimating the input signal U(t) and to 
clarifying its meaning. To this end, we need the factor graph 
representation of f(x(ti)\x(to)) that is shown in Fig. 5] 
which corresponds to the decomposition of (|6) into N discrete 
steps and where T = t\ — to- Note that this factor graph 
is only an approximate representation of f(x(ti) | x(to)), but 
the representation becomes exact in the limit N —> oo. The 
variables U(t) in Fig. pi are related to U(t) by 



U(t) 



N 
T 



t-T/N 



U(t) dr, 



(16) 



i.e., U (t) is the average of U(t) over the corresponding inter- 
val. The proof of this decomposition is given in Appendix [B] 
For finite N, Fig. [5] is a standard linear Gaussian factor 
graph in which snapshots U(t) of U(t) according to (16 1 



appear explicitly and can therefore be estimated by standard 
Gaussian message passing. In the resulting expression for the 
estimate of U(t), we can take the limit N —> oo and thus 
obtain an estimate of U(t). 

Theorem 1. The MAP / MMSE / LMMSE estimate of U(t) 
from observations Y k = yk according to the system model of 
Section [II] is 

-l 



fi(t) 



,2 lT 



X(t] 



x(t) 



x(t)) 



(17) 



where r&x{()> » x(t)> an( l ^X(t)> *x(t) are tne parameters 
of the Gaussian sum-product messages as discussed in Sec- 
tion |IV] □ 
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AT 

e » 



X(ti) 



Fig. 5. Decomposition of the node/factor /(x(ti)|z(to)) into AT discrete 
time steps (with T = t\ — to). This representation is exact only in the limit 
TV — ^ oo. 



aa(o,4^) 



U(t) 



u'(t) 



X'it) 



X(t) 



Fig. 6. Factor graph used in the proof of Theorem [T] 

Proof: Consider the factor graph in Fig. [6] which shows 
the relevant part of Fig. [5] with suitably named variables. We 
determine the mean itlq^ and the variance W~\^ of the a 



posteriori distribution of U{t) as follows. From Hill eq. (54) 
and (III. 5)], we have 



w u(t) = Wfy {t) + W m 



,T_ 

N 



b T tv l7 , (t) b. 



From 1 11 eq. (55)], we then have 



W U(t) m U{t) = ^U(t)^U(t) + ^U(t)^U(ty 



inserting r/t-, 



V(t) 



and using 1 1 1 eq. (III. 6)] yields 



w u(t) m u(t) = ^ bT %'(t)^u'(t)- 



Using ( fl9| ) and ( pi) , we obtain 



L u(t) 



( W U(t)) 1 ( W U(t) m U(t) 



°u 2 + l bT %(t) b ) b T % m tti(j, 



N 



(18) 
(19) 

(20) 

(21) 

(22) 

(23) 
(24) 



and the approximation ( |24| > becomes exact in the limit 

iV ^ oo. 



Using (IJ eq. (11.10)], we have 

^u'(t) = W(t) - 



L x(t) - " l x(t 
and using pT[ eq. (II. 8)], we have 

Vx'(t) + ^x(t) 







(25) 
(26) 

(27) 
(28) 
(29) 

Again, the approximations ( |26| > and ( p9| both become exact 
in the limit — > oo. Inserting ( po*] ) and ( |29] > into ( |24] ) yields 

Jim m &(t) = cr^ T (^x(t) + ^x(t)) (W(t) - ^x(t)) • 

(30) 

The mean of the a posteriori probability of C7(t) is thus well 
defined even for N — > oo and given by p0] ), and the theorem 
follows. □ 



While we have thus established that the mean ( |30] > of the 
a posteriori distribution of U(t) is well-defined for N — > oo, 
it should be pointed out that the variance of this distribution 
is infinite: taking the limit N — > oo of ( 19 1 yields Wfj (t) = 0. 



However, this seemingly problematic result does not imply 



that the estimate ( 17 1 is useless; it simply reflects the obvious 
fact that white noise cannot be fully estimated from discrete 
noisy samples. 



The nature of the estimate ( 17 1 is elucidated by the follow- 



ing theorem, which reformulates the estimation problem of 
this paper as an equivalent regularized least-squares problem. 
For the sake of clarity, we here restrict ourselves to scalar 
observations Yk where v = 1, c = C is a row vector, and 



o z = Vz is a scalar. (The general case is given in 1251.) 

Theorem 2. Assume that the factor graph in Fig. [2] consists 
of K sections between to and tx (with observations starting 
at t\) and assume that the observations Y). are scalars. Then 
the estimated pair (u(t), x(t)) with u(t) as in ( 17 1 minimizes 



u{tf dt 



K 

£ 

fc=i 



Vk 



cx 



subject to the constraints of the system model. 



(31) 



□ 



Proof: Recall the factor graph representation of a least 
squares problem as in Fig. [7] where the large box on top ex- 
presses the given constraints. Clearly, maximizing the function 
represented by Fig. [7] amounts to computing 



n 



n 



argmax 



e z k/( 2 <?k) = argmin V' . 



n 



1/4 (32) 



subject to the constraints. The right-hand side of (J32J will 
be called "cost function." Recall that sum-product message 
passing in cycle-free linear Gaussian factor graphs maximizes 
the left-hand side of (|32| (subject to the constraints) and thus 
minimizes the cost function | pT| . 

Now plugging Fig. [5] into the factor graph in Fig. [2] results 
in a factor graph as in Fig. |7] with cost function 
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Af(0,al) 



constraints 



Zo 



Fig. 7. Factor graphs of a least squares problem used in the proof of 
Theorem [2] 
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Fig. 8. Frequency response (magnitude) of the filters used in Section [VI] 



K 

E 

fc=i 



4/4 



N 



(V 



+ 



'TV 



u{t) 2 dt 



(33) 



□ 



which is ( (31} . 

According to Theorem[2] minimizing ( 3 1 1 is mathematically 
equivalent to the statistical estimation problems of this paper; 
in particular, modeling U(t) as white Gaussian noise amounts 
to regularizing the second term in ( f3"T| ) by penalizing power 
in u(t). 



The functional ( 3 1 1 is amenable to an informal frequency- 



domain analysis that considers the relative power in the 
different frequences of the input signal u(t). In particular, the 
estimate u(t) fits the corresponding output signal y(t) = cx{t) 
to the observations y k preferably by those frequencies that 
appear with little damping in the output signal. Since the 
transfer function from U(t) to Y(t) — cX(t) of the system ([!]) 
is necessarily a (non-ideal) low-pass filter, the estimate u(t) 
will contain little energy in very high frequencies. In this way, 
the spectrum of u(t) is shaped by the transfer function of the 
linear system. 

We also note|^] that the problem of minimizing ( 3 1 1 may be 
viewed as an offline control problem where an input signal 
u(t) is to be determined such that the resulting sampled 

output signal y\, 1/2, ■ ■ ■ follows a desired trajectory y\, §2, 

However, exploring this connection to control theory is beyond 
the scope of this paper. 

VI. Numerical Examples 

We illustrate the estimators of this paper by some simple 
numerical examples. In all these examples, the output signal 
Y(t) is scalar, we use regular sampling at rate f s , i.e., 
Yk = Y(k/ f s ), and the linear system in Fig.[T|is a Butterworth 
lowpass filter of order 4 or 6 with cut-off frequency (-3 dB fre- 
quency) f c J27) . The amplitude response (i.e., the magnitude 
of the frequency response) of these filters is plotted in Fig. [8] 

In these examples, we use the signal-to-noise ratio (SNR) as 
discussed in Appendix [C] Using ( [57} , the SNR of the discrete- 
time observations turns out to be 



SNR 



4 
2 
z 



(T 



fc ■ 2.052 



(34) 



This was pointed out to the authors by Andrew Singer of the University 
of Illinois at Urbana-Champain. 




Fig. 9. Estimation of output signal Y(t) from noisy samples (fat dots) 
at SNR = 10 dB. Solid line: estimate of Y(t) at correct SNR. Dashed line: 
estimation with assumed SNR 100 dB; dotted line: estimation with assumed 
SNR -10 dB. 



for the 4th-order filter and 



SNR 



4 



fc ■ 2.023 



(35) 



for the 6th-order filter. We will measure the SNR in dB (i.e., 
10-log 10 (SNR)). 

In some of these plots, the estimator deliberately assumes 
an incorrect SNR, i.e., an incorrect ratio afj/cr 2 z , in order to 
illustrate the effect of this ratio on ( |31) , 

Estimation of the output signal Y(t) is illustrated in Fig. [9] 
In this example, the linear system is a Butterworth filter of 
order 6. The noisy samples yt are created with f s = 10/ c 
at an SNR of 10 dB. The corresponding estimate of Y{t) is 
shown as solid line in Fig. [9] 

Also shown in Fig. [9] is the effect of estimating with 
an incorrect SNR, i.e., of playing with the ratio cr^/crl as 
mentioned above. Estimating with an assumed SNR that is too 
high results in overfitting; estimating with an assumed SNR 
that is too low reduces the amplitude of the estimated signal. 



Fig. 10 shows the effect of f s /f c on the normalized esti- 
mation error 



SNR, 



-1 * 



E 



(Y k - Y k f 



(36) 



for a Butterworth filter of order 4. For high SNR, we clearly 
see a critical "Nyquist region" where severe undersampling 
sets in. For large f s /f e , the estimate improves by about 
2.62 dB with every factor of 2 in f s /f c , which is less than 
what would be expected (viz., 3 dB) for strictly bandlimited 
signals (28J, (29). 



Estimation of the input signal U(t) is illustrated in Fig. 



11 



for exactly the same setting (with the same discrete-time 
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A. Additional Spectral Shaping 



2.0 



10.0 



100.0 



Fig. 10. Empirical estimation erro r )36) vs. normalized sampling frequency 
fs/ fc, parameterized by the SNR (54}, for a Butterworth filter of order 4. 
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Fig. 11. Input signal estimation for the same cases (and the same time scale) 
as in Fig. 15] The solid line (top) is the correct MMSE / LMMSE estimate of 
^CO- 



observations y k ) as in Fig. [9] The power and the spectral 
content for the three different plots in Fig. 11 illustrate the 



effect of the ratio ajjja z on (31 



VII. Extensions 

We briefly mention a number of extensions and modifi- 
cations of the system model that are required in some of 
the motivating applications and are easily incorporated in the 
estimation algorithms. 



The estimate (17) of the input signal U(t) is marked by an 
implicit spectral shaping (cf. the discussion after Theorem |2j. 

It may sometimes be desirable, however, to control the 
spectrum of the estimate more explicitly. This can be achieved 
by assuming that the input signal U(t) is not white Gaussian 
noise, but white Gaussian noise passed through a suitable 
(finite-dimensional) linear prefilter. The estimation of U (t) is 
easily adapted to this case by including the prefilter in the 
system model. 

In contrast to unfiltered-input estimation as in Section [V] 
estimation of a filtered input signal by means of Kalman 
filtering / smoothing is standard. 

B. Time-Varying and Affine Systems 

In some applications, the dynamics of the system /filter 
in Fig. [T] may change at discrete instants in time (but it is 
always known). This situation occurs, e.g., when the analog 
system /filter is subject to digital control. An example of such 
a case is given in p8) . 

We thus generalize the system model ([5]) and Q to 

dX(t) = (A k X(t) + b k U(t) + h k ) dt (37) 



and 



Y k = C k X(t k ), 



(38) 



which holds for t k < t < tfc+i, where A k and C k are known 
matrices, and where b k and h k are known column vectors. 

If h k = 0, both the factor graph representations and the 
message computation rules remain unchanged except for the 
addition of subscripts to the involved matrices and vectors. 
The case h k ^ is included below. 

C. Multiple Inputs and Internal Noise 

We are also interested in the case where the system /filter 
in Fig. [T] has internal noise sources. (Again, a main mo- 
tivation are analog-to-digital converters, where the noise in 
the analog part cannot be neglected.) Such internal noise 
can be handled mathematically by extending the input signal 
U(t) to a vector U(t) = (Ui(t), U 2 (t), . . . ) T , where the 
first component, U\{t), is the actual input signal while the 
remaining components model the internal noise. For t < t', 
the integral 



U{t) dr 



(39) 



is a zero-mean Gaussian random vector with diagonal covari- 
ance matrix afjlit' — t). The corresponding generalization of 
(gj is 

dX{t) = (AX(t)+BU(t) + h)dt, (40) 

where B is a matrix of suitable dimensions and where we 
have included a constant offset h (a column vector) as in ( |37| . 
Note that power differences and correlations among the input 
signals can be expressed by a suitable matrix B. 

The corresponding generalization of Table [I] is shown in 
Table |ll] The proofs are straightforward modifications of the 
proofs of Table [I] and are omitted. 



TABLE II 

Generalization of Table[I]to < [40] >. 



x(t x ) 



Til 



'X(to) 



x(t Q ) 
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5*x(t0 = eMtl ~ to) ^x ( t o) + A-\e A ^~^ - I)h (H.1) 
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(n.3) 



(II.4) 
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Fig. 12. Decomposition of the node/factor f(x(ti) \ x(to)) into N discrete 
time steps according to (53). 



Appendix A 
Proof of ( p~2] > and ( fl~4] > 



Let 



Ai 



A 



(44) 



If the matrix A is diagonalizable as in ( 1 1 1, then the integrals 
in ( |II.2[ ) and ( |H.4| i can be written as stated in the table where 
the square matrices 0(i) and 0(f) are given by 



\u -I- As V / 



and by 



A fe + A 



Al, -r- An \ ' 



A fe + A 



(41) 



(42) 



respectively, and where is the entry in row k and column 
I of the matrix 



(43) 



fA Nonlinearities 

Mild nonlinearities in the system /filter in Fig. [T] can often 
be handled by extended Kalman filtering J7J, f8J, i.e., by 
iterative estimation using a linearized model based on a 
tentative estimate of the state trajectory X(t). 

VIII. Conclusions 

We have pointed out that exact models of continuous-time 
linear systems driven by white Gaussian noise can be used 
in discrete-time factor graphs. The associated continuous-time 
signals then become computational objects that can be handled 
with arbitrary temporal resolution by discrete-time Gaussian 
message passing. 

Motivated by applications such as dynamical sensors and 
analog-to-digital converters, we have been particularly inter- 
ested in estimating the input signal, which does not seem to 
have been addressed in the prior Kalman filtering literature. 



From ( [TT| , we have 



e** = Qe Q 



and 
and thus 



„A'r _ (e Ar )T = (e Ar )H = (q-I^Atq 



1\H At,oH 



e AT bb T e A r dT = Ql^J e AT *e AT dr ) Q H (47) 



with 



¥ = Q- x b{Q- l b) H 



(45) 



(46) 



(48) 



The element in row k and column I of the matrix under the 
integral is 



/ At.t, At\ / (A fc +A»)T 

e We = tpk.e e y kT 11 , 

V J k,l 



(49) 



where tp^J. refers to the elements of the matrix V, and 
elementwise integration yields 



e AT *e Ar dr 



= J^ = L(x k +x t) t^ l \ (5Q) 
k,£ Afc + X( \ ' 



which proves ( 12 1. The proof of ( 14 1 follows from noting that 
changing e Ar into e~ Ar amounts to a sign change of A. 
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Fig. 13. Decomposition of e AT into N sections. 

Appendix B 

Proof of the Discrete-Time Decomposition in Fig. [5] 
We split the integral ^ into N parts, each of width T/N 
with T = ti- t : 

° AT X(t Q ) 

rkT/N 



X(h) 



N 

V 

k=1 J{k-l)T/N 



A( - T - T hU{t +T)dr (51) 



AT - 



X(t ) 

N r kT/N 
+ y^ e A(T- k T,N) h u{tQ 

k=l J(k-1)T/N 

° AT X(t ) 



T)dr (52) 



STe A V- kT ^b~U(t + kT/N), (53) 



fc=i 



where the approximation p2\ becomes exact in the limit 
N — > oo and where U(t) is defined as in (16 1. The factor 



graph of (53 i is shown in Fig. 12 



The term e AT can also be decomposed into N discrete steps 



as shown in Fig. 13 Plugging Fig. 13 into Fig. [12] yields a 
factor graph that is easily seen to be equivalent to Fig. [5] 

Appendix C 
On SNR 

For the system model of Section [TTJ we may wish to relate 
the input noise power <j\j to the signal-to-noise ratio (SNR) 
of the discrete-time observations. For the sake of clarity, we 
restrict ourselves to scalar observations Y^, i.e., v = 1, c = C 
is a row vector, and <j 2 z = Vz is a scalar. In addition, we 
assume that the continuous-time linear system is time-invariant 
and stable and any initial conditions can be neglected. In this 
case, we define 

E [ Y k] 



SNR 



(54) 



which (under the stated assumptions) is independent of k. We 
then have 



E[Yi]=c^ 



with 



X(oo) 



<j?t lim 



X(oo) c 



e AT bb T e ATT dT 



(55) 



(56) 



from ( |I.2| i; if, in addition, the system is diagonalizable as in 
([TTJ, then 

E [^fe 2 ] =4cQ^(oo)Q H c T (57) 
where 9(oo) is a square matrix with entries 



@{oo) k . e = — -= 

Afc + At 
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